48 research outputs found
Context-dependent articulation of consonant gemination in Estonian
Creative Commons Attribution License (CC BY 4.0)
The three-way quantity system is a well-known phonological feature of Estonian. A number of studies have shown that quantity is realized in a disyllabic foot by the stressed-to-unstressed syllable rhyme duration ratio, with pitch movement as a secondary cue. The stressed syllable rhyme duration results from combining the length of the vowel and the coda consonant, which enables minimal septets of CVCV sequences based on segmental duration. In this study we analyze articulatory (EMA) recordings from four native Estonian speakers producing all possible quantity combinations of intervocalic bilabial stops in two vocalic contexts (/ɑ-i/ vs. /i-ɑ/). The analysis shows that kinematic characteristics (gesture duration, spatial extent, and peak velocity) are primarily affected by quantity on the segmental level: phonologically longer segments are produced with longer and larger lip closing gestures and, conversely, shorter and smaller lip opening movements. The tongue transition gesture is consistently lengthened and slowed down by increasing consonant quantity. In general, both kinematic characteristics and intergestural coordination are influenced by non-linear interactions between segmental quantity levels as well as vocalic context.
Peer reviewed
Creak as a feature of lexical stress in Estonian
Peer reviewed
Articulatory Consequences of Vocal Effort Elicitation Method
Articulatory features from two datasets, Slovak and Swedish, were compared to see whether different methods of eliciting loud speech (ambient noise vs. a visually presented loudness target) result in different articulatory behavior. The features studied were temporal and kinematic characteristics of lip separation within the closing and opening gestures of bilabial consonants, and of the tongue body movement from /i/ to /a/ through a bilabial consonant. The results indicate greater hyperarticulation in the speech elicited with a visually presented target. While individual articulatory strategies are evident, the speaker groups agree on increasing the kinematic features equally within each gesture in response to the increased vocal effort. Another concerted strategy is keeping the tongue response at a minimum, presumably to preserve the acoustic prerequisites for adequate vowel identity. While the method of a visually presented loudness target elicits a larger span of vocal effort, the two elicitation methods achieve comparable consistency per loudness condition.
Peer reviewed
Hierarchical Representation and Estimation of Prosody using Continuous Wavelet Transform
Prominences and boundaries are the essential constituents of prosodic structure in speech. They provide means to chunk the speech stream into linguistically relevant units by assigning them relative saliences and demarcating them within utterance structures. Prominences and boundaries have been widely used both in basic research on prosody and in text-to-speech synthesis. However, there are no representation schemes that would allow both estimating and modelling them in a unified fashion. Here we present an unsupervised unified account for estimating and representing prosodic prominences and boundaries using a scale-space analysis based on the continuous wavelet transform. The methods are evaluated and compared to earlier work using the Boston University Radio News corpus. The results show that the proposed method is comparable with the best published supervised annotation methods.
Peer reviewed
The acoustic basis of lexical stress perception
Peer reviewed